Variability of internal and external loads and technical/tactical outcomes during small-sided soccer games: a systematic review

Small-sided games (SSGs) are widely used in soccer training. However, some of the typical outcomes related to human responses during these games (namely internal and external load) may vary between sessions for similar practice conditions. Thus, the study of intra- and inter-bout variability in response to SSGs is progressively growing. This systematic review aimed to (1) identify studies that have examined the intra- and inter-session bouts' variability levels regarding the internal and external load and technical/tactical outcomes during SSGs and (2) summarize the main evidence. A systematic review of PubMed, SPORTDiscus, Cochrane, and Web of Science databases was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. From the 486 studies initially identified, 24 were fully reviewed, and their outcome measures were extracted and analyzed. Sixteen studies analyzed internal load, 13 studies analyzed external load variables, six studies analyzed technical execution, and two studies analyzed tactical behavior. All studies included SSGs with between 2 and 14 players (1 vs. 1 to 7 vs. 7 SSGs). Internal load and low-speed external load variables presented low variability, while high variations were reported for technical execution and high-speed external loads.


INTRODUCTION
Small-sided games (SSGs) are conditioned forms of official games in which specific task constraints are adjusted to promote new challenges in a tactical/technical dimension [1]. Such adjustments promote variations in physiological and physical demands [2]. These drill-based tasks are very popular in soccer since they seek to promote specificity of practice reflecting the dynamics of the game [3]. In fact, some training protocols use SSGs for promoting physical development in players [4]. However, since they are drill-based games, and due to the match's own dynamics, it is expected that these […] exercise/drill) [9]. They also encompass SSGs' effects on technical responses (technical actions and their accuracy performed during the games) and tactical behaviors (individual behaviors related to the dynamics of the game and interactions with teammates, opponents, and the ball) [10]. In fact, a close relationship is expected between all the above-mentioned outcomes (load, technical, and tactical), namely because SSGs reproduce the dynamics of the formal match. In that sense, it is expected that the emergent behaviors and the collective and individual dynamics will change the external load of the match (e.g., influenced by contextual factors) [11,12]. This will naturally affect physiological responses, given the well-reported relationships between some internal and external load measures [13].
In studies conducted on SSGs among soccer players, heart rate and rate of perceived exertion are the outcomes most commonly related to internal load [7,14,15]. However, in some cases, blood lactate concentration is also reported. The intra- and inter-player variability of these outcomes has been studied during SSGs, with findings suggesting that heart rate responses present lower variability [16-18], while blood lactate concentrations and perceived exertion are more variable [16,19].
In the case of external load, total distance, distances covered at high demands, for instance high-speed running (> 19.8 km/h) or sprinting (> 25 km/h), and the number of accelerations or decelerations are the most frequent outcomes presented by original articles [6,7]. Some studies have revealed that, among these outcomes, total distance has relatively low intra- and inter-player variability during SSGs [20,21], while distances covered at high intensity have relatively high variability [17,20,21].
Regarding technical actions, passes, shots, and receptions are some of the most commonly reported outcomes [8]. Regarding tactical behaviors, some principles of play related to attacking or defending, as well as exploratory behaviors related to playing position in the Cartesian space, are often presented in the literature [3,22].
Despite the small number of studies analyzing the variability of technical and tactical responses during SSGs compared to internal and external load demands, the reports suggest more variability among technical and tactical outcomes [18,23].
As presented above, studies on the within- and between-session variability of internal and external load and technical/tactical dimensions during the same SSGs have become prominent in recent years [16,17,20]. However, as far as we know, no systematic review has summarized the evidence about this kind of variability across different original studies. A summary of the variability levels of SSGs (intra-SSG variability) may provide information vital to identifying the impact of these games on different outcomes and to selecting the most appropriate games and formats to ensure a proper stimulus for specific outcomes. Thus, coaches may decide to use SSGs to work on some variables and other training methods to work on others.
Therefore, the first purpose of this systematic review is to identify studies that have examined the impact of intra- and inter-SSG bouts/sets on soccer players' variability levels of internal and external load and technical/tactical outcomes. The second purpose is to summarize the main evidence presented in the literature. However, in some cases, the specific instruments may be the cause of the variability, and so special attention will be given to that fact during the synthesis of results of the current systematic review.

Table 1. Inclusion and exclusion criteria.
Population
Inclusion criteria: Soccer players of any age or sex, with regular training practice and without major injury or illness.
Exclusion criteria: Sports other than soccer (e.g., rugby, American football, handball, volleyball, futsal, basketball). Players with major injuries or illness.
Intervention
Inclusion criteria: A minimum of two bouts/sets of an SSG (within or between sessions). Thus, the same game was performed at least twice in a single session or at least once in two different training sessions. AND the exact same conditions of practice (e.g., same teams, same format of play) were maintained between repetitions.
Exclusion criteria: SSGs with a single bout/set; bouts/sets that change the constraints (e.g., play format, court dimensions) or conditions (changing teams and players within or between sessions); conditions changed by any exercise or test (e.g., inducing mental or physical fatigue) performed between repetitions occurring in the same session.

Biology of Sport, Vol. 39 No 3, 2022

MATERIALS AND METHODS
The systematic review strategy was conducted according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) guidelines [24].

Eligibility criteria
The inclusion and exclusion criteria can be found in Table 1.
The title, abstract, and reference list of each study were independently screened by two authors (MRG and JA) to locate potentially relevant studies. Additionally, they reviewed the full version of the included papers in detail to identify articles that met the selection criteria. An additional search within the reference lists of the included records was conducted to retrieve additional relevant studies. Discrepancies regarding the selection process were resolved through discussion with a third author (HS). Possible errata for the included articles were considered.

Information sources and search
Electronic databases (PubMed, SPORTDiscus, Cochrane, and Web of Science Core Collection) were searched for relevant publications prior to February 9, 2021. Keywords and synonyms were entered in various combinations: title (i.e., "Soccer" OR "Football") AND title ("small-sided" OR "SSG" OR "conditioned") AND in the title, abstract or keywords ("varia*" OR "reproducibility" OR "repeatability" OR "reliability"

Assessment of methodological quality
An adapted version of the STROBE assessment was used to evaluate the included articles' eligibility [26]. Any disagreement was discussed and resolved by consensus decision. Each of the ten items was coded numerically (1 = considered or 2 = non-considered).
Studies with more than 7 completed items (a score of 7 is not included) were considered to have a low risk of bias.
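The scoring rule above can be illustrated with a minimal sketch. The function names, the data layout, and the example item codes are hypothetical and are not taken from the review; only the coding (1 = considered, 2 = non-considered) and the "more than 7 items" cut-off come from the text.

```python
# Illustrative sketch only: names and example codes are assumptions.
CONSIDERED = 1       # item was considered by the study
NOT_CONSIDERED = 2   # item was not considered by the study


def count_completed(item_codes):
    """Count how many of the ten items were coded as considered (1)."""
    if len(item_codes) != 10:
        raise ValueError("expected exactly ten item codes")
    return sum(1 for code in item_codes if code == CONSIDERED)


def is_low_risk_of_bias(item_codes, threshold=7):
    """True when strictly more than `threshold` items are completed."""
    return count_completed(item_codes) > threshold


# Hypothetical studies, not real data from the included articles.
example_studies = {
    "study_a": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2],  # 8 completed items
    "study_b": [1, 1, 1, 1, 1, 1, 1, 2, 2, 2],  # 7 completed items
}

for name, codes in example_studies.items():
    print(name, count_completed(codes), is_low_risk_of_bias(codes))
```

With the strict inequality, a study with exactly 7 completed items is not classified as low risk of bias, which matches the parenthetical note in the text.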

Study identification and selection
The database search initially identified 483 titles, and 3 were found from external sources. These studies were then exported to […] were excluded owing to a number of reasons, including exclusion criteria 2, 4, and 6. Therefore, 24 articles were eligible for the systematic review (Figure 1). The twenty-four studies included provided mean and standard deviation of reliability data.

Methodological quality
Table 2 presents the summary of the methodological assessment.
Of the 24 included articles, nine (37.5%) were classified as low methodological quality, while the remaining were classified as high quality.

Study characteristics
The characteristics of the studies included in the systematic review can be found in

Column headings of the synthesis tables: significant or meaningful differences between sets/repetitions (within-session, WS, and between-sessions, BS); lowest and highest sets/repetitions (within-session); % of change between the lowest and the highest sets/repetitions (within-session).
Internal load table abbreviations. IL: internal load; ICC: intra-class correlation; %CV: percentage of coefficient of variation; ND: non-described; NA: non-applicable; RPE: rated perceived exertion; HR: heart rate; Avg: average; HRres: heart rate reserve; HRmax: heart rate maximum; La⁻: lactate; red zone: > 80% of maximal HR; *: non-extractable data.
External load table abbreviations. EL: external load; ICC: intra-class correlation; %CV: percentage of coefficient of variation; D: distance; TD: total distance; TAcc: total acceleration; nr: number; TDec: total deceleration; PL: player load; MW: mechanical work; MAV: maximal aerobic velocity; MIV: maximal intermittent velocity; *: non-extractable data.

Results of individual studies: variability of internal load during SSGs
The synthesis of results can be found in Table 5. Thirteen studies analyzed HR, six analyzed RPE, four analyzed lactate, and one analyzed TRIMP. In addition, there were six studies from which it was not possible to extract the mean and standard deviation of the variables analyzed, nine studies that did not present ICC or %CV, and two studies from which it was not possible to extract any data.

Results of individual studies: variability of external load during SSGs
The synthesis of results can be found in Table 6. Eleven studies analyzed distance-covered variables, three analyzed acceleration, two analyzed deceleration, five analyzed player load, and one analyzed mechanical work. There were three studies from which it was not possible to extract the mean and standard deviation of the variables analyzed, seven studies that did not present ICC or %CV, and one study from which it was not possible to extract any data.

Results of individual studies: variability of technical execution during SSGs
The synthesis of results can be found in

Results of individual studies: variability of tactical behavior during SSGs
The synthesis of results of the two studies that include behavior variables can be found in

Variability of internal load during SSGs
In the current systematic review, large within-session variability was observed for the RPE and the time ≤ 89% of the HRpeak (~15-44% of change between the lowest and highest sets/repetitions) [27]. In contrast, the %HRAvg, %HRpeak, and %HRmax showed small within-session variations (~0.5-6% of change between the lowest and highest sets/repetitions) [21,27-32]. Perceived effort is expected to increase across the sets/repetitions within sessions, especially when intra-player responses are analyzed. However, the possible variability of IL between teammates during the same SSGs (depending on the positional role and contextual factors, among other aspects) should be carefully analyzed by coaches so that they can properly compensate the training with more analytical tasks [5].
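As a rough illustration of the two indices discussed here, the sketch below computes the % of change between the lowest and highest sets/repetitions and the %CV across bouts. The definition of % change relative to the lowest bout is an assumption (the text does not state the formula), and the bout values are invented for illustration; they are not data from any included study.

```python
from statistics import mean, stdev


def percent_change_lowest_to_highest(bout_means):
    """% change from the lowest to the highest bout mean.

    Assumed definition: (highest - lowest) / lowest * 100.
    """
    lowest, highest = min(bout_means), max(bout_means)
    if lowest <= 0:
        raise ValueError("lowest bout mean must be positive")
    return (highest - lowest) / lowest * 100.0


def coefficient_of_variation(values):
    """%CV: sample standard deviation divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical bout means for one session with four bouts.
hr_percent_max = [86.0, 88.0, 89.0, 90.0]  # %HRmax per bout
rpe_scores = [5.0, 6.0, 6.5, 7.0]          # RPE per bout

for label, bouts in (("%HRmax", hr_percent_max), ("RPE", rpe_scores)):
    change = percent_change_lowest_to_highest(bouts)
    cv = coefficient_of_variation(bouts)
    print(f"{label}: change = {change:.1f}%, CV = {cv:.1f}%")
```

In this invented example the heart rate measure changes by a few percent across bouts while RPE changes by tens of percent, which mirrors the direction of the pattern described above.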
On the one hand, an analysis of La⁻ concentration in professional players showed a smaller format (e.g., 3 vs.
These results suggest that highly demanding running speeds are highly variable within-session, and this could be compensated by planning training sessions with more analytical tasks [5].
However, in general, lower within-session variations were observed for the mechanical load derived from inertial sensors/accelerometers (e.g., player load) compared to distances covered in high-speed zones (e.g., distance > 19.8 km·h⁻¹) and events related to changes in speed (e.g., accelerations/decelerations).
Similar to the previous discussion on IL variability, only two studies investigated the between-session variability of EL during SSGs.
In addition, a previous umbrella review reported that running demands during SSGs are highly dependent on tactical issues (e.g., rules, coaches' intervention/encouragement, scoring line) [38]. Therefore, using mechanical load derived from the inertial sensors/accelerometers seems to be the most stable analysis method and might be the best way to monitor fitness and fatigue during SSGs [39].
SSGs, as well as official soccer matches, can be considered dynamic systems that involve relationships between two teams under the influence of different positional and contextual factors [5]. Therefore, all training scenarios involve some level of unpredictability, which naturally leads to an increase in the variability of stimuli [16].
This variability is essential for developing the tactical and technical aspects of the game and, in turn, solving problems that emerge during SSGs (see the discussion about within- and between-session variations of technical-tactical outcomes). However, considerable variability between teammates and sessions may not be ideal for developing physiological and physical traits. A more controlled variability level might be better, considering that the training load should be logically progressed, individualized, and standardized [40].

Variability of technical execution during SSGs
Regarding technical execution, considerable within-session variation was reported in the selected studies, while no study has investigated between-session variations, which signifies a gap in the literature. From an ecological perspective, motor responses arise due to the emerging problems in a given task [41]. For this reason, when adopting SSGs, it seems plausible to expect that players' responses will be highly variable, as they can adapt their behavior to create novel contexts. For example, a recent study showed that players' behavioral efficiency was higher in the last bout than in the first [42], which supports the rationale that different technical executions are likely to be observed over successive bouts (within-session variation).
Also, the task constraints seem to indicate players' preferred methods for solving the emerging problems [43,44]. Higher variability naturally reduces prescription quality, as the coach is unlikely to determine the exact stimuli being experienced by the players. At this point, we recommend exercising caution when designing SSGs to promote specific technical actions, especially in high-performance contexts. On the other hand, it has been proposed that increasing variability is required to nurture players' creativity, especially in the early stages of deliberate practice sessions [47]. For this reason, the large variability in technical execution should not be seen as an inherently negative aspect of SSG training but instead as a characteristic that might be considered when prescribing different task conditions.

Variability of tactical behavior during SSGs
The high variability of tactical behaviors during SSGs is expected considering the previously mentioned rationale regarding the unpredictability of the actions in game-based scenarios. This feature of SSGs need not be seen as inherently negative either, as it might nurture players' creativity [47], as mentioned above. However, the current findings are limited in terms of supporting a discussion of this topic, as only one study met all the inclusion criteria.
Only one study [48] evaluated the within- and between-session tactical variability in SSGs. Specifically, this study found no within- or between-session differences in tactical behavior measured by positional data. This result differs from that of a previous study that reported mainly weak ICC values for the within-session reliability of the frequency of tactical actions (core tactical principles) [49]. We argue that the difference between those studies is related to the characteristics of the variables, as was previously introduced regarding the technical execution. Specifically, even if high behavioral variability could be expected in game-based scenarios (like SSGs), we argue that this variability will be enhanced when analyzing discrete low-frequency variables.
In the study of Bredt et al. [49], the assessed data corresponded to the tactical principles performed by each player. Meanwhile, Frencken et al. [50] collected positional data at 45 Hz using a local positioning system, which significantly increased the sample used for analysis. However, since few studies included data on the reliability and variability of tactical behavior in SSGs, we recommend further investigation on this topic. Specifically, future studies testing different task constraints could include data regarding the ICC or the CV of the variables to allow the reader to understand the expected variability of each task condition.
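For readers who want to report an ICC alongside the CV, the following is a minimal sketch of a one-way random-effects ICC(1,1) computed from a players-by-bouts table. The choice of this ICC form is an assumption made for illustration; the included studies may have used other forms, and the data below are invented.

```python
def icc_one_way(scores):
    """One-way random-effects ICC(1,1).

    `scores` is a list of rows, one row per player, one value per bout.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-player and within-player mean squares and k is the number of bouts.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 players and >= 2 bouts, equal bouts per player")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    player_means = [sum(row) / k for row in scores]

    ms_between = k * sum((m - grand_mean) ** 2 for m in player_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, player_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("no variance in the data; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical %HRmax values: 4 players (rows) x 3 bouts (columns).
hr_table = [
    [85.0, 86.0, 87.0],
    [90.0, 91.0, 90.0],
    [80.0, 82.0, 81.0],
    [88.0, 87.0, 89.0],
]
print(f"ICC(1,1) = {icc_one_way(hr_table):.2f}")
```

Values close to 1 indicate that differences between players dominate over bout-to-bout fluctuation within a player; values near or below 0 indicate that within-player fluctuation dominates.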

Study limitations, future research, and practical implications
Few studies have investigated the between-session variability of internal load (n = 2), external load (n = 2), technical outcomes (n = 0), and tactical outcomes (n = 1) during soccer SSGs, particularly in young and professional elite-level players. Further studies should fill this gap in the literature. In addition, future studies should test whether including more restrictive task constraints reduces the variability of internal/external load and technical/tactical outcomes during SSGs. Regarding the methodological quality assessment, ~40% of the included studies presented a low level, which might represent a methodological limitation of the included results.
Coaches should consider three main practical implications when planning SSGs: i) %HRAvg, %HRpeak, %HRmax (more stable within sessions), and RPE scores (more stable between sessions) seem to be the best IL indicators; ii) mechanical load derived from the inertial sensors/accelerometers seems to be the most stable level of analysis and may have greater potential for monitoring fitness and fatigue; and iii) large variability in technical/tactical outcomes should not be seen as an inherently negative aspect of the training process with SSGs but as a characteristic that might be considered when prescribing different task conditions. Possibly, the dynamics of the games and some specific conditions, such as pitch size or goal-setting, can play a determining role in modulating the variability of high-demand match running and technical skills, mainly in cases in which events occur at low frequency and in which the standard deviation may have a considerable impact on the variability.
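The last point, that low-frequency events can show large relative variability even when the absolute spread is small, can be illustrated with a short sketch. The counts below are invented, and the variable names are hypothetical; the example only shows how the same standard deviation yields very different %CV values depending on the mean.

```python
from statistics import mean, stdev


def percent_cv(values):
    """%CV: sample standard deviation divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical per-bout counts for one player across four bouts.
shots_per_bout = [0, 1, 2, 1]        # low-frequency event
passes_per_bout = [20, 21, 22, 21]   # high-frequency event, same spread

for label, counts in (("shots", shots_per_bout), ("passes", passes_per_bout)):
    print(
        f"{label}: mean = {mean(counts):.1f}, "
        f"SD = {stdev(counts):.2f}, CV = {percent_cv(counts):.1f}%"
    )
```

Both series differ from their mean by at most one event, yet the low-frequency series has a much larger %CV, which is one reason discrete, rarely occurring actions tend to appear highly variable.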

CONCLUSIONS
The current systematic review identified that some measures related to SSG responses are more variable than others, and coaches should understand this carefully. In summary, internal load and low-speed external load variables presented low variability between repetitions and sessions for the same format, while high variations were reported for technical execution and high-speed external loads. Therefore, the use of SSGs should be planned based on the type of exposure selected by the coach.
For cardiorespiratory-based stimuli, SSGs may be useful since they provide a stable, low-variability stimulus in terms of internal load demands. However, for promoting mechanical stimuli through high-intensity running, SSGs may be too heterogeneous and variable within and between players, and running-based exercises may be preferable [51,52]. Therefore, it is important to highlight such variability levels, at least to recommend a stronger monitoring process to control the dose imposed and adjust it based on the player's needs.

Conflicts of interest
All the authors declare that they have no conflicts of interest relevant to the content of this review.